Towards Coherent Multi-Document Summarization

Authors

  • Janara Christensen
  • Mausam
  • Stephen Soderland
  • Oren Etzioni
Abstract

This paper presents G-FLOW, a novel system for coherent extractive multi-document summarization (MDS). Where previous work on MDS considered sentence selection and ordering separately, G-FLOW introduces a joint model for selection and ordering that balances coherence and salience. G-FLOW’s core representation is a graph that approximates the discourse relations across sentences based on indicators including discourse cues, deverbal nouns, co-reference, and more. This graph enables G-FLOW to estimate the coherence of a candidate summary. We evaluate G-FLOW on Mechanical Turk, and find that it generates dramatically better summaries than an extractive summarizer based on a pipeline of state-of-the-art sentence selection and reordering components, underscoring the value of our joint model.

Similar resources

From Single to Multi-document Summarization: A Prototype System and its Evaluation

NeATS is a multi-document summarization system that attempts to extract relevant or interesting portions from a set of documents about some topic and present them in coherent order. NeATS is among the best performers in the large scale summarization evaluation DUC-01.

From Single to Multi-document Summarization

NeATS is a multi-document summarization system that attempts to extract relevant or interesting portions from a set of documents about some topic and present them in coherent order. NeATS is among the best performers in the large scale summarization evaluation DUC 2001.

Multi-document Summarization via Information Extraction: A Revisit

This paper describes a novel approach of improving multi-document summarization based on cross-document information extraction (IE). We first show that IE itself is not sufficient to produce fluent and coherent summaries. Then we attempt various methods to automatically incorporate IE results into sentence ranking. Experiments have shown our integration methods can significantly improve a high-...

A Hybrid Hierarchical Model for Multi-Document Summarization

Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we formulate extractive summarization as a two step learning problem building a generative model for pattern discovery and a regression model for inference. We calculate scores for sentences in document clusters based on their latent characteristics u...

The effects of analysing cohesion on document summarization

This paper describes a framework for multi-document summarization which combines three premises: coherent themes can be identified reliably; highly representative themes, running across subsets of the document collection, can function as multi-document summary surrogates; and effective end-use of such themes should be facilitated by a visualization environment which clarifies the relationship b...

Publication date: 2013